-
Modern model hubs, such as Hugging Face, store tens of petabytes of LLMs, with fine-tuned variants vastly outnumbering base models and dominating storage consumption. Existing storage reduction techniques, such as deduplication and compression, are either LLM-oblivious or incompatible with each other, which limits data reduction effectiveness. Our large-scale characterization study across all publicly available Hugging Face LLM repositories reveals several key insights: (1) fine-tuned models within the same family exhibit highly structured, sparse parameter differences suitable for delta compression; (2) bitwise similarity enables LLM family clustering; and (3) tensor-level deduplication is better aligned with model storage workloads, achieving high data reduction with low metadata overhead. Building on these insights, we design BitX, an effective, fast, lossless delta compression algorithm that compresses the XORed difference between fine-tuned and base LLMs. We build ZipLLM, a model storage reduction pipeline that unifies tensor-level deduplication and lossless BitX compression. By synergizing deduplication and compression around LLM family clustering, ZipLLM reduces model storage consumption by 54%, over 20% higher than state-of-the-art deduplication and compression approaches.
Free, publicly-accessible full text available May 4, 2027
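The two ideas named in this abstract, tensor-level deduplication and lossless compression of an XORed difference, can be illustrated with a minimal sketch. This is not the BitX or ZipLLM implementation: the function names, the use of `zlib` as the back-end compressor, the SHA-256 content key, and the toy weights are all assumptions made for illustration.

```python
import hashlib
import struct
import zlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("tensors must have the same byte length")
    n = len(a)
    x = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    return x.to_bytes(n, "little")


def compress_delta(base: bytes, tuned: bytes) -> bytes:
    """Compress the XOR of a fine-tuned tensor against its base (zlib is an assumed stand-in)."""
    return zlib.compress(xor_bytes(base, tuned), 9)


def restore_tuned(base: bytes, blob: bytes) -> bytes:
    """Losslessly rebuild the fine-tuned tensor from the base and the compressed delta."""
    return xor_bytes(base, zlib.decompress(blob))


def tensor_key(raw: bytes) -> str:
    """Content hash used as a deduplication key: identical tensors share one key."""
    return hashlib.sha256(raw).hexdigest()


# Toy data (hypothetical): a "base" tensor and a "fine-tuned" copy
# in which only a few float32 weights changed.
base_vals = [0.001 * i for i in range(4096)]
tuned_vals = list(base_vals)
for i in range(0, 4096, 512):
    tuned_vals[i] += 1e-4

base = struct.pack("<4096f", *base_vals)
tuned = struct.pack("<4096f", *tuned_vals)

blob = compress_delta(base, tuned)
restored = restore_tuned(base, blob)

print("raw bytes:", len(tuned))
print("direct zlib:", len(zlib.compress(tuned, 9)))
print("XOR delta + zlib:", len(blob))
print("lossless:", restored == tuned)
```

Because unchanged weights XOR to zero bytes, the delta is mostly zeros and compresses far better than the fine-tuned tensor alone. A tensor whose `tensor_key` matches one already stored would need no delta at all, only a reference.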
-
The deployment of deep learning-based malware detection systems has transformed cybersecurity, offering sophisticated pattern recognition capabilities that surpass traditional signature-based approaches. However, these systems introduce new vulnerabilities requiring systematic investigation. This chapter examines adversarial attacks against graph neural network-based malware detection systems, focusing on semantics-preserving methodologies that evade detection while maintaining program functionality. We introduce a reinforcement learning (RL) framework that formulates the attack as a sequential decision-making problem, optimizing the insertion of no-operation (NOP) instructions to manipulate graph structure without altering program behavior. Comparative analysis includes three baseline methods: random insertion, hill-climbing, and gradient-approximation attacks. Our experimental evaluation on real-world malware datasets reveals significant differences in effectiveness, with the reinforcement learning approach achieving perfect evasion rates against both Graph Convolutional Network and Deep Graph Convolutional Neural Network architectures while requiring minimal program modifications. Our findings reveal three critical research gaps: transitioning from abstract Control Flow Graph representations to executable binary manipulation, developing universal vulnerability discovery across different architectures, and systematically translating adversarial insights into defensive enhancements. This work contributes to understanding adversarial vulnerabilities in graph-based security systems while establishing frameworks for evaluating machine learning-based malware detection robustness.
Free, publicly-accessible full text available December 1, 2026
-
Non-adiabatic molecular dynamics (NAMD) simulations have become an indispensable tool for investigating excited-state dynamics in solids. In this work, we propose a general framework, N2AMD (Neural-Network Non-Adiabatic Molecular Dynamics), which employs an E(3)-equivariant deep neural Hamiltonian to boost the accuracy and efficiency of NAMD simulations. Distinct from conventional machine learning methods that predict key quantities in NAMD, N2AMD computes these quantities directly with a deep neural Hamiltonian, ensuring excellent accuracy, efficiency, and consistency. N2AMD not only achieves impressive efficiency in performing NAMD simulations at the hybrid functional level within the framework of the classical path approximation (CPA), but also demonstrates great potential in predicting non-adiabatic coupling vectors and suggests a method to go beyond CPA. Furthermore, N2AMD demonstrates excellent generalizability and enables seamless integration with advanced NAMD techniques and infrastructures. Taking several extensively investigated semiconductors as prototypical systems, we successfully simulate carrier recombination in both pristine and defective systems at large scales, where conventional NAMD often significantly underestimates lifetimes or even predicts them qualitatively incorrectly. This framework offers a reliable and efficient approach for conducting accurate NAMD simulations across various condensed materials.
Free, publicly-accessible full text available December 1, 2026
-
The large variety of inflorescence architectures evolved in grasses depends on shape, longevity and determinacy of meristems directing growth of the main and lateral axes. The CLAVATA pathway is known to regulate meristem size and inflorescence architecture in grasses. However, how individual meristem activities are determined and integrated to generate specific inflorescences is not yet understood. We found that activity of distinct meristems in the barley inflorescence is controlled by a signalling pathway comprising the receptor-like kinase Hordeum vulgare CLAVATA1 (HvCLV1) and the secreted CLAVATA3/EMBRYO-SURROUNDING REGION RELATED (CLE)-family peptide FON2-LIKE CLE PROTEIN1 (HvFCP1). HvFCP1 and HvCLV1 interact to promote spikelet formation, but restrict inflorescence meristem and rachilla proliferation. Hvfcp1 or Hvclv1 mutants generate additional rows of spikelets and supernumerary florets from extended rachilla activity. HvFCP1/HvCLV1 signalling coordinates meristem activity through regulation of trehalose-6-phosphate levels. Our discoveries outline a path to engineer inflorescence architecture via specific regulation of distinct meristem activities.
Free, publicly-accessible full text available December 1, 2026
-
In high-performance computing (HPC), modern supercomputers typically provide exclusive computing resources to user applications. Nevertheless, the interconnect network is a resource shared among co-running workloads for both inter-node communication and cross-node I/O access, leading to inevitable network interference. In this study, we develop MFNetSim, a multi-fidelity modeling framework that enables simultaneous simulation of multiple traffic types over the interconnect network, including inter-process communication and I/O traffic. By combining different levels of abstraction, MFNetSim can efficiently co-model the communication and I/O traffic occurring on HPC systems equipped with flash-based storage. We conduct simulation studies of hybrid workloads composed of traditional HPC applications and emerging ML applications on a 1,056-node Dragonfly system with various configurations. Our analysis provides various observations regarding how network interference affects communication and I/O traffic.
Free, publicly-accessible full text available September 12, 2026
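For readers wondering where a figure like 1,056 nodes can come from, the short sketch below shows one Dragonfly parameterization that produces it. The abstract states only the node count. The parameters here are an assumption, following the commonly cited balanced-Dragonfly rule of thumb (routers per group a = 2p, global links per router h = p, and at most a·h + 1 groups), and the function name is hypothetical.

```python
def balanced_dragonfly(p: int) -> dict:
    """Size of a maximum balanced Dragonfly with p compute nodes per router.

    Assumes the balanced rule a = 2p = 2h and the maximum group count
    g = a * h + 1 (one global link between every pair of groups).
    """
    a = 2 * p          # routers per group
    h = p              # global links per router
    g = a * h + 1      # number of groups
    return {
        "nodes_per_router": p,
        "routers_per_group": a,
        "global_links_per_router": h,
        "groups": g,
        "routers": a * g,
        "nodes": p * a * g,
    }


cfg = balanced_dragonfly(4)
print(cfg)
```

With p = 4 this gives 8 routers per group, 33 groups, 264 routers and 1,056 nodes. Whether the study uses exactly this layout is not stated in the abstract.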
-
Free, publicly-accessible full text available July 20, 2026
-
Free, publicly-accessible full text available June 22, 2026
-
Cases of convergent adaptation, especially between close relatives within a lineage, provide insights into constraints underlying the mechanisms of evolution. We examined this in the carnivorous plant family Lentibulariaceae, with its highly divergent trap designs but shared need for prey digestion, by generating a chromosome-level genome assembly for Pinguicula gigantea, the giant butterwort. Our work confirms a history of whole-genome duplication in the genus and provides strong phylogenomic evidence for a sister-group relationship between Lentibulariaceae and Acanthaceae. The genome also reveals that a key digestive adaptation, the expansion of cysteine protease genes active in digestion, was achieved through independent tandem duplications in the butterwort (Pinguicula) and its close relative, the bladderwort (Utricularia). Most of these parallel expansions arose in non-homologous regions of the two genomes, with a smaller subset located on homologous blocks. This study provides clear genomic evidence for convergent evolution and illustrates how similar selective pressures can repeatedly shape genomes in analogous ways.
Free, publicly-accessible full text available September 9, 2026
-
In this paper, we propose and study first- and second-order (in time) stabilized linear finite element schemes for the incompressible Navier-Stokes (NS) equations. The energy, momentum, and angular momentum conserving (EMAC) formulation has emerged as a promising approach for conserving energy, momentum, and angular momentum of the NS equations, while the exponential scalar auxiliary variable (ESAV) has become a popular technique for designing linear energy-stable numerical schemes. Our method leverages the EMAC formulation and the Taylor-Hood element with grad-div stabilization for spatial discretization. We adopt the implicit-explicit backward differentiation formulas (BDFs) coupled with a novel stabilized ESAV approach for time stepping. For the solution process, we develop an efficient decoupling technique for the resulting fully discrete systems so that only one linear Stokes solve is needed at each time step, which is similar to the cost of classic implicit-explicit BDF schemes for the NS equations. Robust optimal error estimates are derived for both velocity and pressure for the two proposed schemes, with Gronwall constants that are independent of the viscosity. Furthermore, it is rigorously shown that the grad-div stabilization term can greatly alleviate the viscosity dependence of the mesh size constraint, which is required for error estimation when such a term is not present in the schemes. Various numerical experiments are conducted to verify the theoretical results and demonstrate the effectiveness and efficiency of the grad-div and ESAV stabilization strategies and their combination in the proposed numerical schemes, especially for problems with high Reynolds numbers.
Free, publicly-accessible full text available May 28, 2026
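For orientation, the EMAC form of the momentum equation with a grad-div term is usually written in the literature as below. The abstract itself gives no formulas, so the symbols are assumptions taken from standard notation: u is velocity, ν the viscosity, f the forcing, γ ≥ 0 a grad-div stabilization parameter, and P a modified pressure. The ESAV time stepping and the paper's discrete schemes are not reproduced here.

```latex
\begin{aligned}
\partial_t u + 2\,D(u)\,u + (\nabla\cdot u)\,u + \nabla P
  - \nu\,\Delta u - \gamma\,\nabla(\nabla\cdot u) &= f, \\
\nabla\cdot u &= 0,
\end{aligned}
\qquad
D(u) = \tfrac{1}{2}\bigl(\nabla u + (\nabla u)^{T}\bigr),
\quad
P = p - \tfrac{1}{2}\,|u|^{2}.
```

For an exactly divergence-free velocity the γ-term and the (∇·u)u term vanish and the system reduces to the usual NS equations. Both terms matter at the discrete level, where Taylor-Hood elements enforce the divergence constraint only weakly.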
